Theoretical and practical research into information retrieval models, Web metasearch (fusion), Hungarian test collection, cross-language retrieval
A unified formal framework for link-based Web retrieval methods was given. New connections were established between information retrieval and information theory, number theory, language technology, medicine, computational complexity, and logic. The average effectiveness of the associative retrieval method was shown to be 0.6. A new method for measuring the retrieval effectiveness of Web search engines was given. An entropy-based procedure for selecting index terms was given, and it was shown that the effectiveness of the vector space retrieval method can be increased in this way. The i2rMeta and NeuRadIR retrieval systems were developed. An English-language medical test database was developed and used to measure the effectiveness of the NeuRadIR system. Six Hungarian-language test databases were developed and used to study the small-world phenomenon and the associative method.
Many of our results have become part of the curriculum at the Faculty of Information Technology of Pannon University (in B.Sc. and Ph.D. programmes); the corresponding lecture notes are freely available to students and to anyone else interested. Beyond Pannon University, the results are also part of the curriculum at the Joint Advanced Student School München, Germany; the University of Colorado at Denver, USA; and the Eidgenössische Technische Hochschule Zürich, Switzerland.
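The summary above mentions entropy-based index term selection without giving the procedure. The following is only a minimal sketch of one way such a selection could work, not the authors' method: it computes the Shannon entropy of each term's occurrence distribution over the documents and keeps the terms at or below a threshold. The function names, the data layout, and the choice to keep low-entropy terms are all assumptions made for illustration.

```python
import math


def term_entropy(counts):
    """Shannon entropy (in bits) of one term's occurrence counts over documents."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def select_terms(term_doc_counts, max_entropy):
    """Keep terms whose entropy is at most max_entropy.

    Assumption: terms concentrated in few documents (low entropy) are taken
    to be the better index terms; the source does not state the criterion.
    """
    return sorted(
        term
        for term, counts in term_doc_counts.items()
        if term_entropy(counts) <= max_entropy
    )


# Hypothetical term-by-document counts for four documents.
counts = {
    "retrieval": [4, 0, 0, 0],  # occurs in one document only
    "neural": [2, 2, 0, 0],     # spread evenly over two documents
    "the": [5, 5, 5, 5],        # spread evenly over all documents
}
selected = select_terms(counts, max_entropy=1.0)
```

With this toy data, "the" is spread evenly over all four documents, has the highest entropy, and is the only term dropped.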
Semantic distillation: a method for clustering objects by their contextual specificity
Techniques for data mining, latent semantic analysis, contextual search of
databases, etc. were developed long ago by computer scientists working on
information retrieval (IR). Experimental scientists, from all disciplines,
having to analyse large collections of raw experimental data (astronomical,
physical, biological, etc.) have developed powerful methods for their
statistical analysis and for clustering, categorising, and classifying objects.
Finally, physicists have developed a theory of quantum measurement, unifying
the logical, algebraic, and probabilistic aspects of queries into a single
formalism. The purpose of this paper is twofold: first to show that when
formulated at an abstract level, problems from IR, from statistical data
analysis, and from physical measurement theories are very similar and hence can
profitably be cross-fertilised, and, secondly, to propose a novel method of
fuzzy hierarchical clustering, termed semantic distillation, strongly inspired
by the theory of quantum measurement, which we developed to analyse raw data
coming from various types of experiments on DNA arrays. We illustrate the
method by analysing DNA array experiments and clustering the genes of the
array according to their specificity.
Comment: Accepted for publication in Studies in Computational Intelligence,
Springer-Verla
Computational Aspects of Connectionist Interaction Information Retrieval
Connectionism is a soft computing technique that aims to enhance retrieval effectiveness and is, at the same time, computationally very demanding. In IR, the computational complexity of retrieval algorithms has only recently become a research issue, although its practical importance has long been recognized. The paper presents a methodical study of the computational complexity of a connectionist retrieval algorithm, the Associative Interaction retrieval method. After a short description of the method itself, the complexities of weight computation and of "winner-takes-all"-based activation spreading (i.e., retrieval) are established. This is followed by an empirical estimate of the probability of having multiple maxima, and by an asymptotic estimate of the probability of having a unique maximum.
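The abstract names "winner-takes-all"-based activation spreading but does not spell it out. As a rough illustration only, the sketch below follows the strongest outgoing link from each node of a weighted graph until a node repeats. The graph layout, the function name `wta_spread`, the tie-breaking rule, and the reading of the closing cycle as the retrieved set are assumptions, not details taken from the paper.

```python
def wta_spread(weights, start):
    """Spread activation by always following the strongest outgoing link.

    weights: dict mapping node -> {neighbour: link weight} (hypothetical layout).
    Returns (path, cycle): the visited nodes in order, and the nodes on the
    cycle that closes the walk (an empty list if a dead end is reached).
    """
    path = [start]
    position = {start: 0}
    current = start
    while True:
        neighbours = weights.get(current, {})
        if not neighbours:
            return path, []
        # Winner-takes-all: pick the maximum-weight link. When several links
        # share the maximum, the smallest node name wins (an assumption).
        winner = max(sorted(neighbours), key=lambda n: neighbours[n])
        if winner in position:
            return path, path[position[winner]:]
        position[winner] = len(path)
        path.append(winner)
        current = winner


# Hypothetical network: a query node "q" linked to three documents.
network = {
    "q": {"d1": 0.2, "d2": 0.7},
    "d1": {"q": 0.5, "d2": 0.1},
    "d2": {"d3": 0.9, "q": 0.3},
    "d3": {"d2": 0.8, "d1": 0.4},
}
path, cycle = wta_spread(network, "q")
```

Each step needs a unique maximum to be well defined, which is why a tie-breaking rule has to be chosen here and why the probability of multiple maxima matters for the method.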